 target velocity


e197fe307eb3467035f892dc100d570a-Supplemental-Conference.pdf

Neural Information Processing Systems

The process for calculating these metrics is described in Appendix C. Moreover, to ensure the comparability between prediction performance metrics and driving performance metrics in the radar plot, we normalize all metrics to the scale of [0, 1]. In the subsequent section, we provide an overview of the DESPOT planner. These two values can only be inferred from history. The safety is represented by the normalized collision rate.
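The excerpt does not say how the metrics are rescaled to [0, 1]. One common way is min-max scaling, sketched below. The function name `normalize_metrics`, the optional `lower`/`upper` bounds, and the sample collision rates are illustrative assumptions, not taken from the paper or its Appendix C.

```python
def normalize_metrics(values, lower=None, upper=None):
    """Min-max scale metric values to [0, 1] (assumed scheme, not the paper's)."""
    lo = min(values) if lower is None else lower
    hi = max(values) if upper is None else upper
    if hi == lo:
        # All values identical: no spread to scale, so map everything to 0.
        return [0.0 for _ in values]
    # Clamp so values outside user-supplied bounds stay inside [0, 1].
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]


# Hypothetical collision rates for three planners.
collision_rates = [0.02, 0.10, 0.06]
normalized = normalize_metrics(collision_rates)
print(normalized)
```

With scaling of this kind, prediction metrics and driving metrics share one axis range, which is what a radar plot needs. Whether a larger value means better or worse still has to be handled per metric.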



Multi-robot Rigid Formation Navigation via Synchronous Motion and Discrete-time Communication-Control Optimization

Yang, Qun, Liew, Soung Chang

arXiv.org Artificial Intelligence

Abstract: Rigid-formation navigation of multiple robots is essential for applications such as cooperative transportation. This process involves a team of collaborative robots maintaining a predefined geometric configuration, such as a square, while in motion. For untethered collaborative motion, inter-robot communication must be conducted through a wireless network. Notably, few existing works offer a comprehensive solution for multi-robot formation navigation executable on microprocessor platforms via wireless networks, particularly for formations that must traverse complex curvilinear paths. To address this gap, we introduce a novel "hold-and-hit" communication-control framework designed to work seamlessly with the widely used Robot Operating System (ROS) platform. It operates over discrete-time communication-control cycles, making it suitable for implementation on contemporary microprocessors. Complementary to hold-and-hit, we propose an intra-cycle optimization approach that enables rigid formations to closely follow desired curvilinear paths, even under the nonholonomic movement constraints inherent to most vehicular robots. The combination of hold-and-hit and intra-cycle optimization ensures precise and reliable navigation even in challenging scenarios. Simulations in a virtual environment demonstrate the superiority of our method in maintaining a four-robot square formation along an S-shaped path, outperforming two existing approaches. Furthermore, real-world experiments validate the effectiveness of our framework: the robots maintained an inter-distance error within 0.069 m and an inter-angular orientation error within 19.15°. Notably, the proposed hold-and-hit framework and optimized nonholonomic motion paradigms are generalizable and extendable to a wide range of multi-robot collaboration problems beyond those studied here.
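To make the reported inter-distance error concrete, the sketch below computes one plausible version of it: the largest deviation of any pairwise robot distance from its value in the desired formation. The abstract does not define the metric, so this definition, the function name `max_inter_distance_error`, and the sample coordinates are all assumptions.

```python
import itertools
import math


def max_inter_distance_error(actual, desired):
    """Largest absolute deviation of any pairwise distance from its desired value.

    `actual` and `desired` are lists of (x, y) positions in metres, one per robot.
    This is an assumed definition, not necessarily the one used in the paper.
    """
    worst = 0.0
    for i, j in itertools.combinations(range(len(desired)), 2):
        d_actual = math.dist(actual[i], actual[j])
        d_desired = math.dist(desired[i], desired[j])
        worst = max(worst, abs(d_actual - d_desired))
    return worst


# Hypothetical 1 m square formation, with the second robot 5 cm out of place.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
measured = [(0.0, 0.0), (1.05, 0.0), (1.0, 1.0), (0.0, 1.0)]
error_m = max_inter_distance_error(measured, square)
print(f"max inter-distance error: {error_m:.3f} m")
```

Because only pairwise distances enter the computation, translating or rotating the whole formation leaves the error at zero. That matches the idea of a rigid formation moving along a path.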



A Proofs

Neural Information Processing Systems

We can therefore drop the latter term from our bound. Consider the Cliff problem of Swamy et al. [2021]. Note that under Asymptotic Realizability (Assumption 5.1), there exists a policy [...] We specialize on the two-arm case as it is the most difficult for the learner. When this limit exists, the average over timesteps of moment-matching error is equal to it. We give the off-policy learners 25 demonstration trajectories, each of length 1000.


Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion

Bohlinger, Nico, Kinzel, Jonathan, Palenicek, Daniel, Antczak, Lukasz, Peters, Jan

arXiv.org Artificial Intelligence

On-robot reinforcement learning is a promising approach to train embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training, utilizing the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: predicting joint target positions for agile, high-speed locomotion, and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments.


Adaptive World Models: Learning Behaviors by Latent Imagination Under Non-Stationarity

Gospodinov, Emiliyan, Shaj, Vaisakh, Becker, Philipp, Geyer, Stefan, Neumann, Gerhard

arXiv.org Artificial Intelligence

Developing foundational world models is a key research direction for embodied intelligence, with the ability to adapt to non-stationary environments being a crucial criterion. In this work, we introduce a new formalism, Hidden Parameter-POMDP, designed for control with adaptive world models. We demonstrate that this approach enables learning robust behaviors across a variety of non-stationary RL benchmarks. Additionally, this formalism effectively learns task abstractions in an unsupervised manner, resulting in structured, task-aware latent spaces.


Online Optimization of Central Pattern Generators for Quadruped Locomotion

Zhang, Zewei, Bellegarda, Guillaume, Shafiee, Milad, Ijspeert, Auke

arXiv.org Artificial Intelligence

Typical legged locomotion controllers are designed or trained offline. This is in contrast to many animals, which are able to locomote at birth and rapidly improve their locomotion skills with few real-world interactions. Such motor control is possible through oscillatory neural networks located in the spinal cord of vertebrates, known as Central Pattern Generators (CPGs). Models of the CPG have been widely used to generate locomotion skills in robotics, but can require extensive hand-tuning or offline optimization of interconnected parameters with genetic algorithms. In this paper, we present a framework for the online optimization of the CPG parameters through Bayesian Optimization. We show that our framework can rapidly optimize and adapt to varying velocity commands and changes in the terrain, for example varying coefficients of friction, terrain slope angles, and added mass payloads placed on the robot. We study the effects of sensory feedback on the CPG, and find that force feedback in the phase equations and posture control (Virtual Model Control) are both beneficial for robot stability and energy efficiency. In hardware experiments on the Unitree Go1, we show rapid optimization (in under 3 minutes) and adaptation of energy-efficient gaits to varying target velocities in a variety of scenarios: varying coefficients of friction, added payloads up to 15 kg, and variable slopes up to 10 degrees. See demo at: https://youtu.be/4qq5leCI2AI
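For readers unfamiliar with CPG models, the sketch below shows a generic network of four coupled phase oscillators that settles into a trot pattern. It illustrates the general idea only: the abstract does not give the paper's CPG equations, and this sketch includes neither sensory feedback nor Bayesian Optimization. The names `TROT_OFFSETS` and `step_cpg`, and the frequency, coupling gain, and time step, are all assumptions.

```python
import math

# Assumed target phase offsets for a trot: diagonal legs move together.
TROT_OFFSETS = {"FL": 0.0, "FR": math.pi, "RL": math.pi, "RR": 0.0}


def step_cpg(phases, frequency_hz, coupling, dt):
    """Advance all leg phases by one explicit Euler step of length dt seconds."""
    new_phases = {}
    for leg, theta in phases.items():
        # Intrinsic oscillation at the commanded gait frequency.
        dtheta = 2.0 * math.pi * frequency_hz
        # Coupling terms pull each pair of legs toward its desired phase lag.
        for other, theta_other in phases.items():
            if other != leg:
                desired_lag = TROT_OFFSETS[other] - TROT_OFFSETS[leg]
                dtheta += coupling * math.sin(theta_other - theta - desired_lag)
        new_phases[leg] = (theta + dtheta * dt) % (2.0 * math.pi)
    return new_phases


def wrapped(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


# Start from arbitrary phases and simulate 5 seconds at a 1 ms step.
phases = {"FL": 0.3, "FR": 1.0, "RL": 2.0, "RR": 0.5}
for _ in range(5000):
    phases = step_cpg(phases, frequency_hz=2.0, coupling=2.0, dt=0.001)

print({leg: round(wrapped(theta - phases["FL"]), 3) for leg, theta in phases.items()})
```

In a model of this kind, quantities such as the frequency and coupling gain are the sort of interconnected parameters that would otherwise need hand-tuning or offline optimization.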


Reinforcement Learning for Shared Autonomy Drone Landings

Backman, Kal, Kulić, Dana, Chung, Hoam

arXiv.org Artificial Intelligence

Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface, and additional disturbances from the ground effect. We therefore propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach comprises two modules: a perception module that encodes information from two RGB-D cameras into a compressed latent representation, and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot's intent and to provide control inputs that augment the user's input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot's tendency to conform to the assistant, proficiency, aggressiveness, and speed. We conduct a user study (n = 28) in which human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4% to 98.2% despite being unaware of the human participants' goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.


Perceptive Locomotion with Controllable Pace and Natural Gait Transitions Over Uneven Terrains

Tan, Daniel Chee Hian, Zhang, Jenny, Chuah, Michael, Li, Zhibin

arXiv.org Artificial Intelligence

This work developed a learning framework for perceptive legged locomotion that combines visual feedback, proprioceptive information, and active gait regulation of foot-ground contacts. The perception requires only one forward-facing camera to obtain the heightmap, and the active regulation of gait paces and traveling velocity is realized through our formulation of CPG-based high-level imitation of foot-ground contacts. Through this framework, an end-user can command task-level inputs to control different walking speeds and gait frequencies according to the traversal of different terrains, which enables more reliable negotiation of encountered obstacles. The results demonstrated that the learned perceptive locomotion policy followed task-level control inputs with the intended behaviors, and was robust in the presence of unseen terrains and external force perturbations. A video demonstration can be found at https://youtu.be/OTzlWzDfAe8, and the codebase at https://github.com/jennyzzt/perceptual-locomotion.